Recurrent Coupled Topic Modeling over Sequential Documents

Abstract

Abundant sequential documents such as online archives, social media, and news feeds are updated in a streaming fashion, where each chunk of documents carries smoothly evolving yet dependent topics. Such digital texts have attracted extensive research on dynamic topic modeling to infer hidden evolving topics and their temporal dependencies. However, most of the existing approaches focus on single-topic-thread evolution and ignore the fact that a current topic may be coupled with multiple relevant prior topics. In addition, these approaches also incur an intractable inference problem when inferring latent parameters, resulting in high computational cost and performance degradation. In this work, we assume that a current topic evolves from all prior topics with corresponding coupling weights, forming the multi-topic-thread evolution. Our method models the dependencies between evolving topics and thoroughly encodes their complex multi-couplings across time steps. To conquer the intractable inference challenge, a new solution with a set of novel data augmentation techniques is proposed, which successfully decomposes the multi-couplings between evolving topics. A fully conjugate model is thus obtained to guarantee the effectiveness and efficiency of the inference technique. A novel Gibbs sampler with a backward–forward filter algorithm efficiently learns the latent time-evolving parameters in closed form. In addition, the latent Indian Buffet Process compound distribution is exploited to automatically infer the overall topic number and customize the sparse topic proportions for each sequential document without bias. The proposed method is evaluated on both synthetic and real-world datasets against competitive baselines, demonstrating its superiority over the baselines in terms of low per-word perplexity, highly coherent topics, and better document time prediction.
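The multi-topic-thread idea in the abstract can be illustrated with a toy sketch. This is not the authors' model or inference procedure: the function name `couple_topics`, the row-normalized weights, and the simple convex mixture are all assumptions chosen only to show how a topic at one time step could depend on every topic at the previous step rather than on a single predecessor.

```python
def couple_topics(prior_topics, coupling_weights):
    """Mix all prior topics into each current topic (illustrative only).

    prior_topics: list of word distributions at step t-1.
    coupling_weights: one row of non-negative weights per current topic,
        with one weight per prior topic (hypothetical parameterization).
    """
    vocab_size = len(prior_topics[0])
    current = []
    for weights in coupling_weights:
        total = sum(weights)
        # Normalize so each current topic is a convex combination.
        norm = [w / total for w in weights]
        mixed = [
            sum(norm[j] * prior_topics[j][v] for j in range(len(prior_topics)))
            for v in range(vocab_size)
        ]
        current.append(mixed)
    return current


# Two prior topics over a three-word vocabulary.
prior = [
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],
]
weights = [
    [1.0, 0.0],  # single thread: inherits only from prior topic 0
    [1.0, 1.0],  # multi thread: coupled equally to both prior topics
]
current = couple_topics(prior, weights)
```

With a weight row that has a single non-zero entry, the sketch reduces to single-topic-thread evolution; any other row couples the current topic to several prior topics at once.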

Similar resources

Topic Modeling in Financial Documents

This paper describes the application of topic modeling techniques to quarterly earnings call transcripts of publicly traded companies. Earnings call transcripts represent an interesting case for analysis because the document is relatively unstructured and potentially more informative than 10K and 10Q disclosures due to the question and answer session consisting of unprepared statements. This pa...

Topic Modeling for Segment-based Documents

Statistical topic models have traditionally assumed that a document is an indivisible unit for the generative process, which may not be appropriate to handle documents that are relatively long and show an explicit multi-topic structure. In this paper we describe a generative model that exploits a given decomposition of documents in smaller, topically cohesive text units, or segments. The key-id...

Sequential Recurrent Neural Networks for Language Modeling

Feedforward Neural Network (FNN)-based language models estimate the probability of the next word based on the history of the last N words, whereas Recurrent Neural Networks (RNN) perform the same task based only on the last word and some context information that cycles in the network. This paper presents a novel approach, which bridges the gap between these two categories of networks. In partic...

ScienceWISE: Topic Modeling over Scientific Literature Networks

We provide an up-to-date view on the knowledge management system ScienceWISE (SW) and address issues related to the automatic assignment of articles to research topics. So far, SW has been proven to be an effective platform for managing large volumes of technical articles by means of ontological concept-based browsing. However, as the publication of research articles accelerates, the expressivi...

Modeling corpora of timestamped documents using semisupervised nonparametric topic models

In this paper we propose a nonparametric topic model to capture the evolution of text over time. Mixture models for modeling text documents based on hierarchical Dirichlet processes (HDP) have been used successfully in recent work to provide a nonparametric prior for the number of topics in the corpus eliminating the need to specify apriori the number of topics. We extend this model to addition...

Journal

Journal title: ACM Transactions on Knowledge Discovery From Data

Year: 2021

ISSN: 1556-472X, 1556-4681

DOI: https://doi.org/10.1145/3451530